The Blizzard Challenge: evaluating corpus-based speech synthesis techniques

Author

  • Alan W. Black
Abstract

The Blizzard Challenge was started in 2005 as a way to evaluate different corpus-based speech synthesis techniques on a common data set. It is very hard to compare speech synthesis techniques when the voices are built from databases of different size and quality. To remove the variables of database size and speaker quality, we proposed a common database that all participants would use. The Challenge itself is for participants to take the given database (or databases) and build a voice using their own voice-building software. After a short time, a set of test sentences is released, to be synthesized by each participant's system. The synthesized utterances are collected and a web-based listening test is set up. Two types of listening test are carried out: a simple MOS-based test, and a set of understandability tests in which the listener is asked to type in what they hear. Three sets of listeners are used: speech experts (provided by the participants' groups), volunteers (collected by web advertising), and paid undergraduate native speakers. Each year the results have been presented at a workshop where participants describe their systems and the final results are given. The Challenge has brought together groups from academia and industry from around the world. Both established and new groups have been represented. The results have been both interesting and unexpected. However, we see the Challenge as a long-term, evolving event. Modifications to the basic structure are considered each year. For example: how to test whether speaker identity is preserved in voice-conversion-based systems; how to test multi-sentence synthesis; what to do about multi-lingual databases; and who is going to run it. No individual results will be presented in this talk, but overall trends will be given, as well as discussion of future directions for Blizzard.
The motivation and details of the challenge are described more fully in [Black and Tokuda 2005]. All the presentations, including anonymized results, are also available online at http://festvox.org/blizzard/
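
The two listening tests described in the abstract could be scored along the following lines. This is a minimal sketch, not the Challenge's actual scoring procedure: the abstract does not say how the type-in tests are scored, so word error rate is assumed here, and the 1 to 5 rating scale, the function names, the system labels, and all data values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_opinion_scores(ratings):
    """Average the ratings for each (system, listener_group) pair.

    `ratings` is an iterable of (system, listener_group, score) tuples.
    A 1-5 score scale is assumed; the abstract does not specify one.
    """
    buckets = defaultdict(list)
    for system, group, score in ratings:
        buckets[(system, group)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


def word_error_rate(reference, typed):
    """Word-level edit distance divided by the reference length.

    Assumed metric for the type-in test: compares what the listener
    typed against the sentence that was actually synthesized.
    """
    ref = reference.lower().split()
    hyp = typed.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from typed text
                            curr[j - 1] + 1,      # extra word in typed text
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical data: two anonymized systems rated by two listener groups.
ratings = [
    ("A", "expert", 4), ("A", "expert", 3),
    ("A", "volunteer", 5),
    ("B", "expert", 2), ("B", "expert", 3),
]
mos = mean_opinion_scores(ratings)
wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
```

Keeping the listener group in the key lets the scores from speech experts, volunteers, and paid undergraduates be reported separately, which matches the three listener sets named in the abstract.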

Similar resources

The blizzard challenge - 2005: evaluating corpus-based speech synthesis on common datasets

In order to better understand different speech synthesis techniques on a common dataset, we devised a challenge that will help us better compare research techniques in building corpus-based speech synthesizers. In 2004, we released the first two 1200-utterance single-speaker databases from the CMU ARCTIC speech databases, and challenged current groups working in speech synthesis around the world...

MILE TTS for Tamil for blizzard challenge

Our participation in the Blizzard Challenge 2014 is only for the Tamil language. We have a unit-selection-based concatenative speech synthesis system. Sentence-level Viterbi search is used to select the reliable speech units among a set of candidate units. The given RD (reading), SUS (semantically unpredictable sentences) and ML (multi‐lingual) test sentences are synthe...

The USTC System for Blizzard Challenge 2012

This paper introduces the speech synthesis system developed by USTC for Blizzard Challenge 2012. An audiobook speech corpus is adopted as the training data for system construction this year. Similar to our previous systems, the hidden Markov model (HMM) based unit selection and waveform concatenation approach is followed to develop our speech synthesis system using this corpus. Considering the ...

PKU Mandarin Speech Synthesis System for Blizzard 2009

This paper describes the development of the PKU Mandarin speech synthesis system for Blizzard Challenge 2009, which is built in the framework of corpus-based unit concatenation synthesis. The system employs a trainable VTR model named HTM to label the VTR trajectories in the corpus and predict the target VTR features. In addition, a CART-based prosody model is built to predict the prosody parameters of...

The VoiceText Text-to-Speech System for the Blizzard Challenge

This paper introduces the VoiceText text-to-speech system developed by Voiceware. Using a corpus-based concatenative speech synthesis technique, we built high-quality synthetic voices from the dataset provided for the Blizzard Challenge 2007. The evaluation results show that VoiceText achieved high performance in both naturalness and intelligibility of synthesized speech.

Publication date: 2007